Abstract

We present efficient algorithms for constructing a shortest path be- 
tween two states in the Tower of Hanoi graph, and for computing the 
length of the shortest path. The key element is a finite-state machine 
which decides, after examining on the average only $63/38 \approx 1.66$ of the
largest discs, whether the largest disc will be moved once or twice. 
This solves a problem raised by Andreas Hinz, and results in a better 
understanding of how the shortest path is determined. Our algorithm 
for computing the length of the shortest path is typically about twice 
as fast as the existing algorithm. We also use our results to give a new 
derivation of the average distance $466/885$ between two random points on
the Sierpinski gasket of unit side.

1. Introduction

The Tower of Hanoi puzzle, invented in 1883 by the French mathematician 
Edouard Lucas, has become a classic example in the analysis of algorithms 
and discrete mathematical structures (see e.g. [2], Section 1.1). The puzzle consists
of n discs of different sizes, stacked on three vertical pegs, in such a way 
that no disc lies on top of a smaller disc. A permissible move is to take the 
top disc from one of the pegs and move it to one of the other pegs, as long 
as it is not placed on top of a smaller disc. The set of states of the puzzle, 
together with the permissible moves, thus forms a graph in a natural way. 
The number of vertices in the n-disc Hanoi graph is $3^n$.

The main question of interest is to find shortest paths in the state graph,
i.e., shortest sequences of moves leading from a given initial state to a given 
terminal state. The simplest and best-known case is that in which it
is required to move all the discs from one of the pegs to another, i.e. where
the initial and terminal states are two of the three "perfect" states with all
the discs on the same peg. This is very easy, and can be shown to take
exactly $2^n - 1$ moves. More difficult is to get from a given arbitrary initial
state to one of the perfect states; Hinz [5] calls this the "p1" problem.
This takes $2^n - 1$ moves in the worst case (which is when the initial state is
another perfect state), and on the average $\frac{2}{3} \cdot (2^n - 1)$ moves for a randomly
chosen initial state [5]. Moreover, there is a simple and efficient algorithm
to compute the shortest path in this case.

In the most general case of arbitrary initial and terminal states, however, 
the question of computing the shortest path and its length (the "p2" problem)
in the most efficient manner has not been completely resolved so far.
(The worst-case behavior is still $2^n - 1$ moves, and the average number of
moves for random initial and terminal states has been shown [1], [3] to be
about $\frac{466}{885} \cdot 2^n$.) The main obstacle in the understanding of the behavior of
the shortest path has been the behavior of the largest disc that "separates"
the initial and terminal states, i.e. the largest disc which is not on the same 
peg in both states (trivially, any larger discs may simply be ignored). It is 
not difficult to see that in a shortest path, this disc will be moved either 
once (from the source peg to the target peg) or twice (from the source 
to the target, via the third peg). The problem is to decide which of the 
two alternatives is the correct one. Once this is settled, the path may be 
constructed by two applications of the algorithm for the p1 problem. Hinz
[3] proposed an algorithm for the computation of the shortest path based 
on this idea. The algorithm consists essentially of computing the length of 
the path for both alternatives and choosing the shorter of the two. 

In this paper, we propose a more thorough explanation of the process 
whereby it is decided which of the two paths is the shortest. We show that 
it is possible to keep track of the relevant information using a finite-state 
machine, which at each step reads the locations of the next-smaller disc in 
the initial and terminal states, and changes its internal state accordingly. 
Eventually, the machine reaches a terminal state, whereupon it pronounces 
which of the two paths is the shortest. For a random input, its expected 
stopping time is computed to be $63/38$. In other words, after observing on the
average the locations of just the 1.66 largest discs in the initial and terminal
states, we will know which of the paths to choose, and we will be able to
continue using the algorithm for the p1 problem. If one is interested just
in the length of the shortest path, then our algorithm is typically about
twice as fast as the algorithm proposed by Hinz [3] (with a small constant
overhead due to the initial 1.66 discs), since it removes the need to compute
the distance both for the path that moves the largest disc once and for the
path that moves it twice.

The paper is organized as follows: In the next section, we define the 
discrete Sierpinski gasket graph, a graph which is isomorphic to the Tower 
of Hanoi state graph, but for which the labeling of the vertices is simpler 
to understand. In section 3, we present the main ideas for this graph, and 
then in section 4 show how to translate the results to the Hanoi graph by a 
re-labeling of the vertices. In section 5 we perform a probabilistic analysis 
of the finite-state machine, to compute the average number $63/38$ of discs that
need to be read in order to decide whether the largest disc will be moved
once or twice, and to give a new derivation of the asymptotic value $\frac{466}{885} \cdot 2^n$ for
the average distance between two random states in the n-disc Hanoi graph.

2. The discrete Sierpinski gasket 

We now define a family of graphs called discrete Sierpinski gaskets. These 
graphs are finite versions of the famous fractal constructed by the Polish 
mathematician Waclaw Sierpinski in 1915. The connection between the 
Tower of Hanoi problem and the Sierpinski gasket was first observed by Ian 
Stewart [8], and was later used by Andreas Hinz and Andreas Schief [7] in
their calculation of the average distance between points on the Sierpinski 
gasket. 

The nth discrete Sierpinski gasket graph, which we denote by $SG_n$, consists
of the vertex set $V(SG_n) = \{T, L, R\}^n$ (the symbols T, L, R indicate
"top", "left" and "right", respectively), with the edges defined as follows:
First, for each $x = a_{n-1}a_{n-2}\ldots a_1 a_0 \in V(SG_n)$ (for reasons that will become
apparent below, this will be our standard indexing of the coordinates of the
vertices of $SG_n$) we have edges connecting x to

$$a_{n-1}a_{n-2}\ldots a_1\beta, \qquad \beta \in \{T, L, R\} \setminus \{a_0\}.$$

Second, define the tail of $x = a_{n-1}a_{n-2}\ldots a_0$ as the suffix $a_k a_{k-1}\ldots a_1 a_0$ of x,
where k is maximal such that $a_k = a_{k-1} = \cdots = a_0$. If x has a tail of length
$k + 1 < n$, then x is of the form $a_{n-1}a_{n-2}\ldots a_{k+2}\beta\alpha\alpha\ldots\alpha$. Connect x with
an edge to the vertex

$$a_{n-1}a_{n-2}\ldots a_{k+2}\alpha\beta\beta\ldots\beta.$$
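
The two edge rules can be sketched in a few lines of Python. This is only an illustration under our own conventions: a vertex is a string over the letters T, L, R written most significant symbol first, and the names `sg_neighbors` and `sg_graph` are hypothetical helpers, not notation from the text.

```python
from itertools import product

SYMBOLS = "TLR"

def sg_neighbors(x):
    """Return the set of neighbors of vertex x (a string over T, L, R) in SG_n."""
    n = len(x)
    alpha = x[-1]                      # least significant symbol a_0
    # First rule: replace a_0 by either of the two other symbols.
    neighbors = {x[:-1] + beta for beta in SYMBOLS if beta != alpha}
    # Second rule: the tail is the maximal constant suffix of x.
    tail_len = n - len(x.rstrip(alpha))
    if tail_len < n:
        beta = x[n - tail_len - 1]     # the symbol just before the tail
        neighbors.add(x[: n - tail_len - 1] + alpha + beta * tail_len)
    return neighbors

def sg_graph(n):
    """Adjacency map of SG_n (n >= 1), keyed by vertex label."""
    vertices = ("".join(t) for t in product(SYMBOLS, repeat=n))
    return {v: sg_neighbors(v) for v in vertices}
```

For instance, `sg_neighbors("TRR")` contains `"RTT"`, the edge joining the top and right triangles, while a corner such as `"TTT"` has only the two neighbors given by the first rule.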

One possible embedding of $SG_n$ in the plane is illustrated in Figure
1 below. This embedding makes clear the meaning of the labeling of the
vertices: The first letter (the "most significant digit") signifies whether the
vertex is in the top, left or right triangles inside the big triangle; the next
letter locates the vertex within the top, left or right thirds of that triangle,
etc.

Figure 1: The graph $SG_4$ (top corner TTTT, bottom-left corner LLLL, bottom-right corner RRRR)



It will be shown in section 4 that $SG_n$ is isomorphic, in a computationally
straightforward way, to the n-disc Hanoi graph. (The same was shown in
[7], with less emphasis on explicit computation of the isomorphism.) Thus,
the problem of shortest paths on the Hanoi graph reduces to that of shortest
paths in the discrete Sierpinski gasket. We tackle this problem in the next
section.

3. Shortest paths in $SG_n$

For vertices $x, y \in V(SG_n)$, we define the distance $d(x, y)$ to be the length of
a shortest path from x to y. Our goal is to write down a recursion equation
for this distance, which is at the heart of the finite-state machine we will
construct to compute $d(x, y)$. First, let us review briefly some of the known
facts about $d(x, y)$ in the simple case when y is one of the "perfect" states
LLL...L, RR...R, TT...T. For concreteness, assume that y = LLL...L, and
let $x = a_{n-1}a_{n-2}\ldots a_1 a_0 \in V(SG_n)$ as before. Then it is known that

$$d(x, y) = \sum_{0 \le k \le n-1,\ a_k \ne L} 2^k.$$

A simple algorithm exists for computing a shortest path from x to y in this
case. In the Hanoi labeling of the graph, the algorithm is described in [5].
In the current labeling, the algorithm is even simpler and is based on the
binary number system: if one identifies the symbol L with 0 and the symbols
R and T with 1, then traversing the edges of the graph becomes equivalent
to the operations of subtraction or addition of 1 in binary notation. The
number of steps to reach LL...L = 00...0 is then clearly the right-hand side
in the above equation.
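
As a small illustration of this formula, the following Python sketch computes the distance to a perfect state in two equivalent ways: by summing the powers of two directly, and by reading the label as a binary number. The function names and the string representation (most significant symbol first) are our own assumptions.

```python
def dist_to_perfect(x, corner="L"):
    """Sum of 2^k over positions k with a_k != corner, where x = a_{n-1}...a_0."""
    n = len(x)
    return sum(2 ** k for k in range(n) if x[n - 1 - k] != corner)

def dist_to_perfect_binary(x, corner="L"):
    """Same value, read off as a binary number: corner -> 0, other symbols -> 1."""
    bits = "".join("0" if s == corner else "1" for s in x)
    return int(bits, 2)
```

For example, `dist_to_perfect("TLL")` is 4: one step from TLL to LTT, then three more steps down to LLL.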

With these preparatory remarks, we now attack the problem of general
$x = a_{n-1}a_{n-2}\ldots a_0$, $y = b_{n-1}b_{n-2}\ldots b_0$. First, observe that we may assume
that $a_{n-1} \ne b_{n-1}$, since otherwise we may simply consider x and y as vertices
in the graph $SG_{n-1}$ (note the self-similar structure in the definition of the
graph, also apparent in the Tower of Hanoi puzzle when one ignores the
largest disc). For concreteness, we shall analyze in detail the case where
$a_{n-1} = T$, $b_{n-1} = R$. Referring to Figure 1 for convenience, we see that

$$d(x, y) = \min\bigl(1 + d(x, TRR\ldots R) + d(y, RTT\ldots T),\ 1 + 2^{n-1} + d(x, TLL\ldots L) + d(y, RLL\ldots L)\bigr),$$

since in a shortest path from x to y, one must go from the top triangle
to the right triangle either through the edge {TRR...R, RTT...T} (we call
this Alternative 1, see Theorem 1 below) or through a shortest path from
LTT...T to LRR...R (Alternative 2). In the Tower of Hanoi language, this
is an indication of the fact that in a shortest sequence of moves the largest
disc must move either once or twice, see [5].

To simplify the next few equations, introduce the following notation: if
$u = c_{n-1}c_{n-2}\ldots c_0 \in \{T, L, R\}^n$, let $u' = c_{n-2}c_{n-3}\ldots c_0$, and define for any
$a \in \{L, T, R\}$

$$f_a(u) = \sum_{0 \le k \le n-1,\ c_k \ne a} 2^k.$$

Then we have

$$d(x, y) = 1 + \min\bigl(f_R(x') + f_T(y'),\ 2^{n-1} + f_L(x') + f_L(y')\bigr).$$
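
The last formula is easy to evaluate directly. The sketch below does so for the case treated in the text (x begins with T, y begins with R); the names `f` and `dist_T_R` and the string encoding are our own illustrative choices.

```python
def f(a, u):
    """f_a(u): sum of 2^k over positions k where the symbol c_k differs from a."""
    n = len(u)
    return sum(2 ** k for k in range(n) if u[n - 1 - k] != a)

def dist_T_R(x, y):
    """d(x, y) for equal-length labels with x starting with 'T' and y with 'R'."""
    assert len(x) == len(y) and x[0] == "T" and y[0] == "R"
    n = len(x)
    xs, ys = x[1:], y[1:]                               # the strings x' and y'
    alternative_1 = f("R", xs) + f("T", ys)             # cross the edge TRR...R - RTT...T
    alternative_2 = 2 ** (n - 1) + f("L", xs) + f("L", ys)  # detour through the left triangle
    return 1 + min(alternative_1, alternative_2)
```

For example, `dist_T_R("TLL", "RLL")` is 5, attained by Alternative 2, whereas Alternative 1 would cost 7.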

The recursion equations which will enable us to construct our finite-state 
machine and compute d(x, y) are now given by the following theorem: 



Theorem 1 - The finite state machine. For $u = c_{n-1}c_{n-2}\ldots c_0$, $v =
d_{n-1}d_{n-2}\ldots d_0 \in \{T, L, R\}^n$, define the functions

$$p(u, v) = \min\bigl(f_R(u) + f_T(v),\ 2^n + f_L(u) + f_L(v)\bigr),$$
$$q(u, v) = \min\bigl(2^n + f_R(u) + f_T(v),\ f_L(u) + f_L(v)\bigr),$$
$$r(u, v) = \min\bigl(f_R(u) + f_T(v),\ f_L(u) + f_L(v)\bigr)$$

(note that p, q, r depend implicitly on the length n of the strings). Then we
have the equations

$$p(u, v) = \begin{cases}
f_R(u) + f_T(v) \quad \text{(Alternative 1)} & \text{if } (c_{n-1}, d_{n-1}) \in \{(R,T), (R,L), (R,R), (L,T), (T,T), (T,R)\}, \\
2^n + p(u', v') & \text{if } (c_{n-1}, d_{n-1}) \in \{(T,L), (L,R)\}, \\
2^n + r(u', v') & \text{if } (c_{n-1}, d_{n-1}) = (L,L),
\end{cases}$$

$$q(u, v) = \begin{cases}
f_L(u) + f_L(v) \quad \text{(Alternative 2)} & \text{if } (c_{n-1}, d_{n-1}) \in \{(L,L), (L,T), (L,R), (R,L), (T,L), (T,R)\}, \\
2^n + q(u', v') & \text{if } (c_{n-1}, d_{n-1}) \in \{(T,T), (R,R)\}, \\
2^n + r(u', v') & \text{if } (c_{n-1}, d_{n-1}) = (R,T),
\end{cases}$$

$$r(u, v) = \begin{cases}
f_R(u) + f_T(v) \quad \text{(Alternative 1)} & \text{if } (c_{n-1}, d_{n-1}) = (R,T), \\
f_L(u) + f_L(v) \quad \text{(Alternative 2)} & \text{if } (c_{n-1}, d_{n-1}) = (L,L), \\
2^{n-1} + r(u', v') & \text{if } (c_{n-1}, d_{n-1}) \in \{(L,T), (R,L)\}, \\
2^n + r(u', v') & \text{if } (c_{n-1}, d_{n-1}) = (T,R), \\
2^{n-1} + p(u', v') & \text{if } (c_{n-1}, d_{n-1}) \in \{(R,R), (T,T)\}, \\
2^{n-1} + q(u', v') & \text{if } (c_{n-1}, d_{n-1}) \in \{(T,L), (L,R)\}.
\end{cases}$$

These equations will hold even for n = 1 if one defines trivially $p(u, v) =
q(u, v) = r(u, v) = 0$ for $u = v = \emptyset \in \{T, L, R\}^0 = \{\emptyset\}$. Alternatives 1, 2 in the
parentheses signify whether the minimum is attained by its first or second
argument, respectively.



Proof. The proof consists simply of inspecting the definitions of the func- 
tions p, q, r and verifying that in each of the cases the required relations 
hold. We omit the details, since the reader would no doubt have to go 
through the same thought process to verify them on her own as to check 
that a purported proof is correct. ■ 
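
To make the recursion concrete, here is a minimal Python sketch of a machine with the three non-terminal states p, q, r, following the case analysis of Theorem 1. It is an illustration under our own conventions, not a transcription of the figures: labels are strings over T, L, R with the most significant symbol first, the table `STEP` and the functions `run_machine` and `distance` are hypothetical names, and the general case is handled by stripping the common prefix and relabeling the symbols so that x starts with T and y with R.

```python
def f(a, u):
    """f_a(u): sum of 2^k over positions k where the symbol c_k differs from a."""
    n = len(u)
    return sum(2 ** k for k in range(n) if u[n - 1 - k] != a)

# Non-terminal transitions: state -> {input pair: (next state, shift)}.
# Reading a pair while the strings have length n adds 2^(n - 1 + shift).
STEP = {
    "p": {"TL": ("p", 1), "LR": ("p", 1), "LL": ("r", 1)},
    "q": {"TT": ("q", 1), "RR": ("q", 1), "RT": ("r", 1)},
    "r": {"LT": ("r", 0), "RL": ("r", 0), "TR": ("r", 1),
          "RR": ("p", 0), "TT": ("p", 0), "TL": ("q", 0), "LR": ("q", 0)},
}

def run_machine(u, v, state="p"):
    """Evaluate p, q or r on (u, v); return (value, alternative), 0 meaning a draw."""
    total = 0
    while u:
        n = len(u)
        pair = u[0] + v[0]
        if pair in STEP[state]:
            state, shift = STEP[state][pair]
            total += 2 ** (n - 1 + shift)
            u, v = u[1:], v[1:]
            continue
        # Terminal step: the remaining symbols no longer affect the decision.
        if state == "p" or (state == "r" and pair == "RT"):
            return total + f("R", u) + f("T", v), 1
        return total + f("L", u) + f("L", v), 2
    return total, {"p": 1, "q": 2, "r": 0}[state]

def distance(x, y):
    """d(x, y) in SG_n for arbitrary labels of equal length."""
    i = 0
    while i < len(x) and x[i] == y[i]:
        i += 1                                   # discard identical leading symbols
    if i == len(x):
        return 0
    x, y = x[i:], y[i:]
    third = (set("TLR") - {x[0], y[0]}).pop()
    relabel = str.maketrans(x[0] + y[0] + third, "TRL")
    value, _ = run_machine(x[1:].translate(relabel), y[1:].translate(relabel))
    return 1 + value
```

For example, `run_machine("LL", "LL")` returns `(4, 2)`, so `distance("TLL", "RLL")` is 5 with the largest separating symbol resolved by Alternative 2, and `run_machine("L", "L")` returns `(2, 0)`, a draw in which both alternatives give a shortest path.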



A schematic representation of the finite state machine is shown in Figures
2, 3 and 4 below. We present several variants of the machine: the machine
in Figure 2 only decides between Alternative 1 and Alternative 2. The
machine in Figure 3, which has auxiliary counters for the distance and for
the variable n (so strictly speaking it is not really a finite-state automaton),
actually computes $d(x, y)$. Note that this is still designed only for the case
in which x begins with the symbol T and y begins with R. Figure 4 shows
the complete finite-state machine which may be used to compute $d(x, y)$
for any two states $x, y \in V(SG_n)$. This includes an initial component that
discards the first few symbols which are identical for x and y, and another
component that permutes the symbols T, L, R to fit the design of the basic
machine in Figure 2.

Figure 2: The finite state machine: deciding between Alternative 1 and
Alternative 2. The two letters signify the two inputs from x and y, reading
at each step the next-most-significant symbol. The parentheses in the non-
terminal states indicate that if the input terminates without a decision, then
in the start state Alternative 1 wins, in the rightmost state Alternative 2
wins, and in the middle state there is a draw, meaning that the shortest
path is not unique and both alternatives are valid.

Figure 3: The finite state machine: computing $d(x, y)$. Add to d the number
on each edge traversed, decrease n by 1 and replace x by x' and y by y'.

Figure 4: The finite state machine: computing $d(x, y)$, the general case.
Follow instructions as for Figure 3. For the deterministic transition, do not
read input or decrease n.

4. Translating between the Hanoi graph and $SG_n$

We now define the graph of states in the n-disc Tower of Hanoi puzzle, and
show that it is isomorphic to $SG_n$. The isomorphism may be computed
by reading sequentially the locations of the discs, starting with the largest
one (which corresponds to the most significant digit in the Sierpinski gasket
labeling), and following a diagram of permutations translating the labels
of the three pegs into the symbols T, L, R (another finite-state machine!).
Together with the results of the previous section, this will give an effective
means of computing the length of the shortest path between any two vertices
in the Hanoi graph, and of deciding whether the largest disc will be moved
once or twice in a shortest path. After that, we describe briefly an
algorithm for actually constructing the shortest path, based on the algorithm
for getting to a perfect state.

Label the three pegs in the Tower of Hanoi with the symbols 0, 1, 2. Since
in a legal state, on each of the pegs the discs are arranged in increasing size
from top to bottom, a state is described uniquely by specifying, for each disc,
the label of its peg. Thus, we define $H_n$, the n-th Hanoi graph, to be the
graph whose vertex set is the set $V(H_n) = \{0, 1, 2\}^n$ (with the coordinates of
the vectors specifying, from left to right, the labels of the pegs of the largest
disc, second-largest disc, etc.), and where edges between states correspond
to permissible moves. Figure 5 shows the graph $H_4$.



Figure 5: The graph $H_4$ (top corner 0000)

The isomorphism between $H_n$ and $SG_n$ is now described by the following
theorem:

Theorem 2. $H_n$ and $SG_n$ are isomorphic graphs. The finite-state
machine shown below translates a Hanoi state $s \in \{0, 1, 2\}^n$ into a Sierpinski
gasket labeling $z \in \{T, L, R\}^n$, by reading the digits from left to right and
outputting the symbols T, L, R at each step according to the identifications
in its internal state, then changing the internal state according to the input.




Figure 6: Computing the isomorphism between $H_n$ and $SG_n$

Proof. This is Lemma 2 in [7]. There it was claimed simply that $H_n$ and
$SG_n$ are isomorphic, but the proof, which is by induction, actually describes
how to compute the isomorphism, and this is easily seen to be equivalent to
our finite-state machine formulation. ■
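
One machine of this kind can be sketched as follows. This is an assumption on our part rather than a transcription of Figure 6: the internal state is the current assignment of symbols to pegs, the initial assignment 0 to T, 1 to L, 2 to R is an arbitrary choice, and after reading a disc on peg p the sketch swaps the symbols of the other two pegs. The helper names are hypothetical; `hanoi_neighbors` and `sg_neighbors` are included only so that the map can be checked against both edge definitions.

```python
from itertools import product

def hanoi_to_sg(s, start="TLR"):
    """Translate a Hanoi state (peg digits, largest disc first) into an SG label."""
    label = list(start)                 # label[p] = symbol currently assigned to peg p
    out = []
    for ch in s:
        p = int(ch)
        out.append(label[p])
        j, k = [q for q in range(3) if q != p]
        label[j], label[k] = label[k], label[j]   # swap the two other pegs
    return "".join(out)

def hanoi_neighbors(s):
    """States reachable from s by one permissible move."""
    top = {}
    for i, peg in enumerate(s):
        top[peg] = i                    # last occurrence = smallest disc on that peg
    result = set()
    for peg, i in top.items():
        for target in "012":
            # The top disc of the target peg must be larger (smaller index) or absent.
            if target != peg and top.get(target, -1) < i:
                result.add(s[:i] + target + s[i + 1:])
    return result

def sg_neighbors(x):
    """Neighbors of x in SG_n, following the two edge rules of section 2."""
    n = len(x)
    alpha = x[-1]
    result = {x[:-1] + beta for beta in "TLR" if beta != alpha}
    tail_len = n - len(x.rstrip(alpha))
    if tail_len < n:
        beta = x[n - tail_len - 1]
        result.add(x[: n - tail_len - 1] + alpha + beta * tail_len)
    return result

def is_isomorphism(n):
    """Check that hanoi_to_sg is a bijection H_n -> SG_n that preserves edges."""
    states = ["".join(t) for t in product("012", repeat=n)]
    if len({hanoi_to_sg(s) for s in states}) != 3 ** n:
        return False
    return all({hanoi_to_sg(t) for t in hanoi_neighbors(s)} == sg_neighbors(hanoi_to_sg(s))
               for s in states)
```

With this convention the perfect states 00...0, 11...1, 22...2 are sent to the corners TT...T, LL...L, RR...R, and `hanoi_to_sg("012")` gives `"TRT"`.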

Summary. By running the machines of Figures 4 and 6 in parallel, we 
now have an algorithm for computing d(x, y) for two arbitrary states in the 
Hanoi graph, and for solving the decision problem for the largest disc, i.e. to 
decide whether the largest disc which it is necessary to move will move once 
or twice. As we will show in the next section, when x and y are randomly 
chosen states, the expected stopping time of the machine is 63/38. (This 
random variable even has an exponential tail distribution, so with very high 
probability only a small number of discs will need to be read to solve the 
decision problem.) Having solved the decision problem, the shortest path 
may now be computed in a straightforward manner, as described in [5], using
the algorithm for getting to a perfect state (use the algorithm described in
[3], or the algorithm for the Sierpinski gasket described in section 2 together
with the machine of Figure 6, which incidentally leads to an algorithm for
getting to perfect states which we have not found in the literature).



5. The case of random inputs 

5.1. How many discs must be read to solve the decision problem? 

In this section, we calculate the average number of discs that must be read 
in order to decide whether in a shortest path the largest disc will be moved 
once or twice. Let $x = a_{n-1}a_{n-2}\ldots a_0 \in V(H_n)$, $y = b_{n-1}b_{n-2}\ldots b_0 \in V(H_n)$.
Assume that we have already discarded the largest discs which for x and y
were on the same peg, so that $a_{n-1} \ne b_{n-1}$. The algorithm for solving the
decision problem then tells us to run the machines of Figures 4 and 6 until
they reach a terminal state (or we run out of input). Since we have already
initialized by discarding irrelevant discs, we will really be using the machine
of Figure 2 (keeping track of the correct identification of the symbols L, T, R
with the pegs 0, 1, 2). Since we are dealing with random inputs, what we
are really interested in is the absorption time of the Markov chain whose
transition matrix is

$$\begin{array}{c} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{array}
\begin{pmatrix}
2/9 & 1/9 & 0 & 2/3 & 0 \\
2/9 & 1/3 & 2/9 & 1/9 & 1/9 \\
0 & 1/9 & 2/9 & 0 & 2/3 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

into the terminal states 4 and 5. We may identify these two states to get 
the simpler matrix 



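$$\begin{array}{c} 1 \\ 2 \\ 3 \\ (45) \end{array}
\begin{pmatrix}
2/9 & 1/9 & 0 & 2/3 \\
2/9 & 1/3 & 2/9 & 2/9 \\
0 & 1/9 & 2/9 & 2/3 \\
0 & 0 & 0 & 1
\end{pmatrix}$$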



For i = 1, 2, 3, denote by $t_i$ the expected time to get to state (45), starting
from state i. Then clearly we have the equations

$$t_1 = 1 + \frac{2}{9} t_1 + \frac{1}{9} t_2,$$
$$t_2 = 1 + \frac{2}{9} t_1 + \frac{1}{3} t_2 + \frac{2}{9} t_3,$$
$$t_3 = 1 + \frac{1}{9} t_2 + \frac{2}{9} t_3.$$

It may easily be verified that the solution to this system of equations is

$$t_1 = 63/38, \quad t_2 = 99/38, \quad t_3 = 63/38.$$

The value $t_1 = 63/38$ is our expected stopping time, since i = 1 corresponds
to the initial state. Note that this value is the limit as $n \to \infty$ of the average
number of discs that must be read; in reality, for finite n the value will be
slightly smaller since after n steps we run out of input and the machine
terminates even if it has not reached a terminal state. To summarize:

Theorem 3. The decision problem for shortest paths can be solved in
average time $O(1)$. Specifically, the average number of disc pairs that our
algorithm must read, once identical discs have been discarded, is bounded
from above by, and converges as $n \to \infty$ to, 63/38.
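
The value 63/38 can be checked with exact rational arithmetic. The sketch below solves $(I - Q)t = \mathbf{1}$, where Q is the transient part (states 1, 2, 3) of the transition matrix given above; the function name and the choice of Gauss-Jordan elimination are ours, not part of the original argument.

```python
from fractions import Fraction as F

# Transition probabilities among the non-terminal states 1, 2, 3.
Q = [
    [F(2, 9), F(1, 9), F(0)],
    [F(2, 9), F(1, 3), F(2, 9)],
    [F(0),    F(1, 9), F(2, 9)],
]

def expected_absorption_times(Q):
    """Solve (I - Q) t = 1 exactly by Gauss-Jordan elimination over the rationals."""
    m = len(Q)
    # Augmented matrix [I - Q | 1].
    A = [[(F(1) if i == j else F(0)) - Q[i][j] for j in range(m)] + [F(1)]
         for i in range(m)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [a / A[col][col] for a in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [row[-1] for row in A]

t = expected_absorption_times(Q)   # [t_1, t_2, t_3]
```

Here `t[0]` equals `Fraction(63, 38)`, which is approximately 1.66.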

5.2. The average distance between points on the Sierpinski gasket 

Hinz and Schief [7] computed the average length $466/885$ of a shortest path
between two random points on the infinite Sierpinski gasket of unit side.
An equivalent result of Hinz [3] and of Chan [1], in terms of the Tower of
Hanoi, is that the average number of moves in a shortest path between two
random states in the n-disc Tower of Hanoi is asymptotically $(466/885) \cdot 2^n$
as $n \to \infty$.

Without going into too much detail, we show that it is possible to obtain
the value of 466/885 just by looking at the finite-state machine of Figure
4. Since we are dealing with the infinite gasket, we start with n = 0 and,
as before, decrease the value of n after each step, so that n will go into the
negative integers. Let $d_1, d_2, d_3, d_4$ be the expected accumulated values of
the variable d if one starts the machine, with initial values n = 0, d = 0,
at each of the four non-terminal states, in order of their distance from the
state start (so $d_1$ is the total distance; $d_2$ is the distance after discarding
identical most-significant digits of x and y, etc.). Then we have the equations



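$$d_1 = \frac{1}{3} \cdot \frac{1}{2} d_1 + \frac{2}{3} \cdot \frac{1}{2} d_2,$$
$$d_2 = \frac{2}{3}\left(\frac{1}{2} + \frac{2}{3}\right) + \frac{2}{9}\left(1 + \frac{1}{2} d_2\right) + \frac{1}{9}\left(1 + \frac{1}{2} d_3\right),$$
$$d_3 = \frac{2}{9} \cdot \frac{2}{3} + \frac{2}{9}\left(\frac{1}{2} + \frac{1}{2} d_2\right) + \frac{2}{9}\left(\frac{1}{2} + \frac{1}{2} d_4\right) + \frac{2}{9}\left(\frac{1}{2} + \frac{1}{2} d_3\right) + \frac{1}{9}\left(1 + \frac{1}{2} d_3\right),$$
$$d_4 = \frac{2}{3}\left(\frac{1}{2} + \frac{2}{3}\right) + \frac{2}{9}\left(1 + \frac{1}{2} d_4\right) + \frac{1}{9}\left(1 + \frac{1}{2} d_3\right).$$

The value (1/2 + 2/3) in the second and fourth equations is the expected
value of $f_C(x) + f_A(y)$ (respectively $f_B(x) + f_B(y)$), given that the first pair
of inputs is one of the six values AC, AA, BA, CC, CB, CA (respectively
BB, BA, BC, CB, AB, AC).

Again, it may be verified that the solution to this system of equations is

$$d_1 = 466/885, \quad d_2 = 233/177, \quad d_3 = 188/177, \quad d_4 = 233/177,$$

which gives our claimed value for $d_1$.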
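
As a check on the value 466/885, the following sketch solves a linear system of the form d = c + W d with exact fractions. The coefficients are our own reading of the recursion of Theorem 1 at scale n = 0 (each step halves the scale, a terminal step contributes 1/2 + 2/3 on average in the outer states and 2/3 in the middle state), so they should be taken as an assumption rather than as a transcription of the original equations.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination over the rationals."""
    m = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [a / M[col][col] for a in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]

half = F(1, 2)
g = half + F(2, 3)          # average terminal contribution in the two outer states

# Unknowns d_1..d_4: start state, then the three states of the deciding machine.
W = [
    [F(1, 6), F(1, 3),  F(0),     F(0)],
    [F(0),    F(1, 9),  F(1, 18), F(0)],
    [F(0),    F(1, 9),  F(1, 6),  F(1, 9)],
    [F(0),    F(0),     F(1, 18), F(1, 9)],
]
c = [
    F(0),
    F(2, 3) * g + F(2, 9) + F(1, 9),
    F(2, 9) * F(2, 3) + 3 * F(2, 9) * half + F(1, 9),
    F(2, 3) * g + F(2, 9) + F(1, 9),
]

A = [[(F(1) if i == j else F(0)) - W[i][j] for j in range(4)] for i in range(4)]
d = solve(A, c)             # [d_1, d_2, d_3, d_4]
```

Under these assumptions `d[0]` comes out as `Fraction(466, 885)`, in agreement with the value of Hinz and Schief.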